{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 缓存篇\n",
    "\n",
    "参考：[一行代码实现Python加速：缓存篇](https://www.bilibili.com/video/BV1v34y167rc?spm_id_from=333.999.0.0)\n",
    "\n",
    "least recently used"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算斐波那契数列\n",
    "def fib(n):\n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib(n-2) + fib(n-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "320 ms ± 8.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "fib(30)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "from functools import lru_cache"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算斐波那契数列\n",
    "@lru_cache(100)\n",
    "def fib(n):\n",
    "    if n == 0:\n",
    "        return 0\n",
    "    elif n == 1:\n",
    "        return 1\n",
    "    else:\n",
    "        return fib(n-2) + fib(n-1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "66.9 ns ± 1.43 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "fib(30)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# JIT\n",
    "\n",
    "just in time compilation，运行时编译，动态编译。\n",
    "\n",
    "准备数据 -》 编译部分代码 -》 再执行\n",
    "\n",
    "参考：[引入一行代码让python提速50倍:两个jit库介绍](https://www.bilibili.com/video/BV1k44y1E7iW?spm_id_from=333.999.0.0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## numba"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 使用蒙特卡罗方法计算pi\n",
    "def cal_pi(n):\n",
    "    t = 0\n",
    "    for _ in range(n):\n",
    "        x = random.random()\n",
    "        y = random.random()\n",
    "        if x ** 2 + t ** 2 < 1:\n",
    "            t += 1\n",
    "    return t / n * 4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "492 ms ± 2.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "cal_pi(100_0000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "from numba import jit\n",
    "\n",
    "@jit(nopython=True)\n",
    "def jit_cal_pi(n):\n",
    "    t = 0\n",
    "    for _ in range(n):\n",
    "        x = random.random()\n",
    "        y = random.random()\n",
    "        if x ** 2 + t ** 2 < 1:\n",
    "            t += 1\n",
    "    return t / n * 4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "11.3 ms ± 115 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "jit_cal_pi(100_0000)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## jax\n",
    "\n",
    "谷歌出品，主要针对机器学习，深度学习中的向量运算。\n",
    "\n",
    "\n",
    "```\n",
    "pip install jax\n",
    "pip install jaxlib\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from jax import jit as jax_jit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "@jax_jit\n",
    "def mat(x, y, z):\n",
    "    for i in range(50):\n",
    "        x = x @ y + z\n",
    "    return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "d = 1024\n",
    "x, y, z = np.random.random((d, d)), np.random.random((d, d)), np.random.random((d, d))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.06 s ± 170 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "mat(x, y, z)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 也是个装饰器，也可以当作一个高阶函数来使用\n",
    "jit_max = jax_jit(mat)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "400 ms ± 27.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "jit_max(x, y, z)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 多进程\n",
    "\n",
    "参考：[用一行代码实现python多进程加速](https://www.bilibili.com/video/BV1EZ4y1X7Aj?spm_id_from=333.999.0.0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "import multiprocessing as mp\n",
    "import random"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 计算pi的方法，既不准确又不高效\n",
    "def cal_pi(n):\n",
    "    t = 0\n",
    "    for _ in range(n):\n",
    "        x = random.random()\n",
    "        y = random.random()\n",
    "        if x ** 2 + t ** 2 < 1:\n",
    "            t += 1\n",
    "    return t / n * 4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.000000000000001e-06\n",
      "CPU times: user 4.86 s, sys: 18.2 ms, total: 4.88 s\n",
      "Wall time: 4.9 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "pi = 0\n",
    "for i in range(10):\n",
    "    # 计算100万次\n",
    "    pi += cal_pi(100_0000)\n",
    "print(pi / 10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 12.4 ms, sys: 28.2 ms, total: 40.5 ms\n",
      "Wall time: 1.8 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "# 使用10个进程运行\n",
    "rets = mp.Pool(4).map(cal_pi, [100_0000] * 10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[4e-06, 4e-06, 4e-06, 4e-06, 4e-06, 4e-06, 4e-06, 4e-06, 4e-06, 4e-06]"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rets"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.000000000000001e-06\n",
      "CPU times: user 9.86 ms, sys: 20 ms, total: 29.8 ms\n",
      "Wall time: 1.77 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "pi = 0\n",
    "for r in mp.Pool(4).imap(cal_pi, [100_0000] * 10):\n",
    "    pi += r\n",
    "print(pi / 10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "from tqdm import tqdm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 10/10 [00:04<00:00,  2.02it/s]\n"
     ]
    }
   ],
   "source": [
    "# 增加进度条的显示\n",
    "pi = 0\n",
    "for i in tqdm(range(10)):\n",
    "    pi += cal_pi(100_0000)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "10it [00:01,  6.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "4.000000000000001e-06\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "pi = 0\n",
    "for r in tqdm(mp.Pool(4).imap(cal_pi, [100_0000] * 10)):\n",
    "    pi += r\n",
    "print(pi / 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## pandarallel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from pandarallel import pandarallel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>a</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   a\n",
       "0  0\n",
       "1  1\n",
       "2  2\n",
       "3  3\n",
       "4  4"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = pd.DataFrame({'a': list(range(10))})\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "import time \n",
    "\n",
    "\n",
    "def my_time_consuming_function(x):\n",
    "    time.sleep(1)\n",
    "    return x * 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     0\n",
       "1    10\n",
       "2    20\n",
       "3    30\n",
       "4    40\n",
       "5    50\n",
       "6    60\n",
       "7    70\n",
       "8    80\n",
       "9    90\n",
       "Name: a, dtype: int64"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['a'].apply(lambda x: my_time_consuming_function(x))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "INFO: Pandarallel will run on 8 workers.\n",
      "INFO: Pandarallel will use standard multiprocessing data transfer (pipe) to transfer data between the main process and workers.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "0     0\n",
       "1    10\n",
       "2    20\n",
       "3    30\n",
       "4    40\n",
       "5    50\n",
       "6    60\n",
       "7    70\n",
       "8    80\n",
       "9    90\n",
       "Name: a, dtype: int64"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pandarallel.initialize()\n",
    "df['a'].parallel_apply(my_time_consuming_function)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "aee8b7b246df8f9039afb4144a1f6fd8d2ca17a180786b69acc140d282b71a49"
  },
  "kernelspec": {
   "display_name": "Python 3.9.4 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
